20 research outputs found

The Design of Terra: Harnessing the Best Features of High-Level and Low-Level Languages

Author: DeVito Zachary
Hanrahan Pat
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 1st Summit on Advances in Programming Languages (SNAPL 2015)
Publication date: 01/01/2015

Applications are often written using a combination of high-level and low-level languages, since this allows performance-critical parts to be carefully optimized while other parts are written more productively. This approach is used in web development, game programming, and in build systems for applications themselves. However, most languages were not designed with interoperability in mind, resulting in glue code and duplicated features that add complexity. We propose a two-language system where both languages were designed to interoperate. Lua is used for our high-level language since it was originally designed with interoperability in mind. We create a new low-level language, Terra, that we designed to interoperate with Lua. It is embedded in Lua, and meta-programmed from it, but has a low level of abstraction suited for writing high-performance code. We discuss important design decisions (compartmentalized runtimes, glue-free interoperation, and meta-programming features) that enable Lua and Terra to be more powerful than the sum of their parts.

CiteSeerX

Dagstuhl Research Online Publication Server

The unexplained nature of reading.

Author: Adelman James S.
Estes Zachary
Marquis Suzanne J.
Sabatos-DeVito Maura G.
Publication venue: American Psychological Association (APA)
Publication date: 01/01/2013

The effects of properties of words on their reading aloud response times (RTs) are one major source of evidence about the reading process. The precision with which such RTs could potentially be predicted by word properties is critical to evaluate our understanding of reading but is often underestimated due to contamination from individual differences. We estimated this precision without such contamination individually for 4 people who each read 2,820 words 50 times each. These estimates were compared to the precision achieved by a 31-variable regression model that outperforms current cognitive models on variance-explained criteria. Most (around 2/3) of the meaningful (non-first-phoneme, non-noise) word-level variance remained unexplained by this model. Considerable empirical and theoretical-computational effort has been expended on this area of psychology, but the high level of systematic variance remaining unexplained suggests doubts regarding contemporary accounts of the details of the mechanisms of reading at the level of the word. Future assessment of models can take advantage of the availability of our precise participant-level database.

Crossref

Archivio istituzionale della Ricerca - Bocconi

Warwick Research Archives Portal Repository

Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging

Author: Bernstein Gilbert
DeVito Zachary
Fisher Matthew
Hanrahan Pat
Mara Michael
Nießner Matthias
Ragan-Kelley Jonathan
Theobalt Christian
Zollhöfer Michael
Publication date: 01/01/2016

Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes. The mathematical descriptions of these functions are extremely concise, but their implementation in real code is tedious, especially when optimized for real-time performance on modern GPUs in interactive applications. In this work, we propose a new language, Opt (available at http://optlang.org), for writing these objective functions over image- or graph-structured unknowns concisely and at a high level. Our compiler automatically transforms these specifications into state-of-the-art GPU solvers based on Gauss-Newton or Levenberg-Marquardt methods. Opt can generate different variations of the solver, so users can easily explore tradeoffs in numerical precision, matrix-free methods, and solver approaches. In our results, we implement a variety of real-world graphics and vision applications. Their energy functions are expressible in tens of lines of code, and produce highly optimized GPU solver implementations. These solvers have performance competitive with the best published hand-tuned, application-specific GPU solvers, and orders of magnitude beyond a general-purpose auto-generated solver.

arXiv.org e-Print Archive

MPG.PuRe
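
For readers unfamiliar with the Gauss-Newton method named in the Opt abstract, the following is a minimal sketch in plain Python. It is not Opt code and does not reflect Opt's API or its generated GPU solvers; the model `y = a * exp(b * x)`, the function name `gauss_newton`, and the sample data are all assumptions chosen for illustration. Each iteration linearizes the residuals and solves the 2x2 normal equations directly.

```python
import math


def gauss_newton(xs, ys, a, b, iterations=10):
    """Fit y = a * exp(b * x) by minimizing the sum of squared residuals.

    Hypothetical helper for illustration only; not part of Opt.
    """
    for _ in range(iterations):
        # Accumulate the normal equations (J^T J) d = -J^T r for two unknowns.
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y      # residual for this sample
            da = e             # partial derivative of r with respect to a
            db = a * x * e     # partial derivative of r with respect to b
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        det = jaa * jbb - jab * jab
        if det == 0.0:
            break  # singular system; a damped (Levenberg-Marquardt) step would help here
        # Solve the 2x2 system with Cramer's rule and apply the update.
        a += (-ga * jbb + jab * gb) / det
        b += (-jaa * gb + jab * ga) / det
    return a, b


# Noise-free synthetic data generated from a = 2.0, b = 0.5 (assumed values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit = gauss_newton(xs, ys, a=1.8, b=0.55)
```

With a starting guess this close to the true parameters and noise-free data, the iteration converges quickly; a poor starting guess can make plain Gauss-Newton diverge, which is one reason damped variants such as Levenberg-Marquardt exist.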

MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems

Author: Acun Bilge
Ardalani Newsha
Brooks David
DeVito Zachary
Golden Alicia
Hsia Samuel
Wei Gu-Yeon
Wu Carole-Jean
Publication date: 18/10/2023

Training and deploying large machine learning (ML) models is time-consuming and requires significant distributed computing infrastructures. Based on real-world large model training on datacenter-scale infrastructures, we show that 14-32% of all GPU hours are spent on communication with no overlapping computation. To minimize the outstanding communication latency, in this work, we develop an agile performance modeling framework to guide parallelization and hardware-software co-design strategies. Using the suite of real-world large ML models on state-of-the-art GPU training hardware, we demonstrate 2.24x and 5.27x throughput improvement potential for pre-training and inference scenarios, respectively.

arXiv.org e-Print Archive

First-class runtime generation of high-performance types using exotypes

Author: Alex Aiken
Daniel Ritchie
Matt Fisher
Pat Hanrahan
Zachary DeVito
Publication venue: Association for Computing Machinery (ACM)

Crossref

Surface Deformation Methods for Describing and Visualizing Protein Binding Sites

Author: DeVito Zachary
Publication date: 01/01/2008

Dataspace

Just-in-time Length Specialization of Dynamic Vector Code

Author: Justin Talbot
Pat Hanrahan
Zachary DeVito
Publication date: 05/03/2020

Dynamically typed vector languages are popular in data analytics and statistical computing. In these languages, vectors have both dynamic type and dynamic length, making static generation of efficient machine code difficult. In this paper, we describe a trace-based just-in-time compilation strategy that performs partial length specialization of dynamically typed vector code. This selective specialization is designed to avoid excessive compilation overhead while still enabling the generation of efficient machine code through length-based optimizations such as vector fusion, vector copy elimination, and the use of hardware SIMD units. We have implemented our approach in a virtual machine for a subset of R, a vector-based statistical computing language. In a variety of workloads, containing both scalar and vector code, we show performance close to that of auto-vectorized C over a large range of vector sizes.

CiteSeerX
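
The vector fusion mentioned in the length-specialization abstract can be illustrated with a small Python sketch. This is not the paper's virtual machine or its trace compiler; the function names `unfused` and `fused` and the expression `a * b + c` are assumptions made for illustration. The point is that once the lengths are known to match, the element-wise operators can share one loop and skip the intermediate vector.

```python
def unfused(a, b, c):
    """Evaluate a * b + c one operator at a time."""
    # The multiply allocates a full-length temporary before the add runs.
    tmp = [x * y for x, y in zip(a, b)]
    return [t + z for t, z in zip(tmp, c)]


def fused(a, b, c):
    """Evaluate a * b + c in a single pass with no temporary vector."""
    # Length guard, checked once outside the loop. This stands in for the
    # specialization assumption; raising on a mismatch is a simplification
    # (R itself would recycle the shorter vector).
    n = len(a)
    if len(b) != n or len(c) != n:
        raise ValueError("vector lengths differ")
    out = [0.0] * n
    for i in range(n):
        out[i] = a[i] * b[i] + c[i]
    return out


result = fused([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [1.0, 1.0, 1.0])
```

Both functions compute the same values; the fused version trades generality for a single loop, which is the kind of loop a compiler can then map onto SIMD instructions.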